Structural Linguistics and Unsupervised Information Extraction

نویسنده

  • Ralph Grishman
چکیده

A precondition for extracting information from large text corpora is discovering the information structures underlying the text. Progress in this direction is being made in the form of unsupervised information extraction (IE). We describe recent work in unsupervised relation extraction and compare its goals to those of grammar discovery for science sublanguages. We consider what this work on grammar discovery suggests for future directions in unsupervised IE.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Acquiring Topic Features to improve Event Extraction: in Pre-selected and Balanced Collections

Event extraction is a particularly challenging type of information extraction (IE) that may require inferences from the whole article. However, most current event extraction systems rely on local information at the phrase or sentence level, and do not consider the article as a whole, thus limiting extraction performance. Moreover, most annotated corpora are artificially enriched to include enou...

متن کامل

Unsupervised Active Learning of CRF Model for Cross Lingual Information Extraction

Manual annotation of the training data of information extraction models is a time consuming and expensive process but necessary for the building of information extraction systems. Active learning has been proven to be effective in reducing manual annotation efforts for supervised learning tasks where a human judge is asked to annotate the most informative examples with respect to a given model....

متن کامل

LoLo: A System Based On Terminology For Multilingual Extraction

An unsupervised learning method, based on corpus linguistics and special language terminology, is described that can extract time-varying information from text streams. The method is shown to be ‘language-independent’ in that its use leads to sets of regular-expressions that can be used to extract the information in typologically distinct languages like English and Arabic. The method uses the i...

متن کامل

Unsupervised Learning of Contextual Role Knowledge for Coreference Resolution

We present a coreference resolver called BABAR that uses contextual role knowledge to evaluate possible antecedents for an anaphor. BABAR uses information extraction patterns to identify contextual roles and creates four contextual role knowledge sources using unsupervised learning. These knowledge sources determine whether the contexts surrounding an anaphor and antecedent are compatible. BABA...

متن کامل

Semantic Similarity: What For?

Linguistic similarity has been a prominent notion and tool in computational linguistics and related areas, as elaborated nicely in the announcement of this workshop. Yet, what exactly counts as “similarity”, or when two linguistic concepts should be regarded as similar, often remains rather vague and ill posed, which is in fact quite typical for unsupervised notions. This talk will focus on sim...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012